Parallelizing MCMC with Random Partition Trees

Authors

  • Xiangyu Wang
  • Fangjian Guo
  • Katherine A. Heller
  • David B. Dunson
Abstract

The modern scale of data has brought new challenges to Bayesian inference. In particular, conventional MCMC algorithms are computationally very expensive for large data sets. A promising approach to solve this problem is embarrassingly parallel MCMC (EP-MCMC), which first partitions the data into multiple subsets and runs independent sampling algorithms on each subset. The subset posterior draws are then aggregated via some combining rules to obtain the final approximation. Existing EP-MCMC algorithms are limited by approximation accuracy and difficulty in resampling. In this article, we propose a new EP-MCMC algorithm PART that solves these problems. The new algorithm applies random partition trees to combine the subset posterior draws, which is distribution-free, easy to resample from and can adapt to multiple scales. We provide theoretical justification and extensive experiments illustrating empirical performance.
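
The partition, sample, and combine pipeline described in the abstract can be illustrated with a minimal sketch. This is not the PART algorithm: it swaps the random partition tree for a fixed equal-width grid in one dimension, and it uses a conjugate Gaussian-mean model so that each subset posterior can be sampled directly instead of by MCMC. The function names, the fractionated prior, and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def subset_posterior_draws(x_subset, n_subsets, sigma, prior_var, n_draws, rng):
    """Draw from one subset posterior of a Gaussian mean (known sigma).

    Assumption: the N(0, prior_var) prior is raised to the power 1/n_subsets,
    so the product of all subset posteriors is proportional to the full
    posterior. The model is conjugate, so we sample directly; a real
    application would run an MCMC chain here.
    """
    prior_prec = 1.0 / (prior_var * n_subsets)
    post_prec = prior_prec + len(x_subset) / sigma**2
    post_mean = (x_subset.sum() / sigma**2) / post_prec
    return rng.normal(post_mean, np.sqrt(1.0 / post_prec), size=n_draws)

def combine_on_shared_bins(draws_per_subset, n_bins, n_resample, rng):
    """Combine subset draws by multiplying histogram densities on a shared grid.

    A fixed grid stands in for the random partition tree. Resampling is easy:
    pick a bin with probability proportional to the product of the subset bin
    frequencies, then draw uniformly inside that bin.
    """
    lo = min(d.min() for d in draws_per_subset)
    hi = max(d.max() for d in draws_per_subset)
    edges = np.linspace(lo, hi, n_bins + 1)
    log_w = np.zeros(n_bins)
    for d in draws_per_subset:
        counts, _ = np.histogram(d, bins=edges)
        # Tiny pseudo-count avoids log(0) in bins a subset never visited.
        log_w += np.log((counts + 1e-12) / len(d))
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    bins = rng.choice(n_bins, size=n_resample, p=w)
    return rng.uniform(edges[bins], edges[bins + 1])

def demo(seed=0, n_data=4000, n_subsets=4, sigma=1.0, prior_var=100.0):
    """Run the sketch end to end; return combined draws and the exact posterior mean."""
    rng = np.random.default_rng(seed)
    x = rng.normal(1.0, sigma, size=n_data)
    # Step 1: partition the data into subsets.
    subsets = np.array_split(rng.permutation(x), n_subsets)
    # Step 2: sample each subset posterior independently (parallelizable).
    draws = [subset_posterior_draws(s, n_subsets, sigma, prior_var, 20000, rng)
             for s in subsets]
    # Step 3: aggregate the subset draws with a combining rule.
    combined = combine_on_shared_bins(draws, n_bins=40, n_resample=5000, rng=rng)
    full_post_mean = (x.sum() / sigma**2) / (1.0 / prior_var + n_data / sigma**2)
    return combined, full_post_mean

if __name__ == "__main__":
    combined, full_post_mean = demo()
    print("combined mean:", combined.mean(), "exact posterior mean:", full_post_mean)
```

In this toy setting the exact full-data posterior is known in closed form, so the combined draws can be checked against it. A fixed grid does not adapt to multiple scales and becomes impractical in higher dimensions, which is the kind of limitation the abstract says the tree-based combination is meant to address.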

Similar resources

Information theory tools to rank MCMC algorithms on probabilistic graphical models

We propose efficient MCMC tree samplers for random fields and factor graphs. Our tree sampling approach combines elements of Monte Carlo simulation as well as exact belief propagation. It requires that the graph be partitioned into trees first. The partition can be generated by hand or automatically using a greedy graph algorithm. The tree partitions allow us to perform exact inference on each ...

Tree-Guided MCMC Inference for Normalized Random Measure Mixture Models

Normalized random measures (NRMs) provide a broad class of discrete random measures that are often used as priors for Bayesian nonparametric models. Dirichlet process is a well-known example of NRMs. Most of posterior inference methods for NRM mixture models rely on MCMC methods since they are easy to implement and their convergence is well studied. However, MCMC often suffers from slow converg...

Bayesian Learning in Undirected Graphical Models: Approximate MCMC Algorithms

Bayesian learning in undirected graphical models—computing posterior distributions over parameters and predictive quantities— is exceptionally difficult. We conjecture that for general undirected models, there are no tractable MCMC (Markov Chain Monte Carlo) schemes giving the correct equilibrium distribution over parameters. While this intractability, due to the partition function, is familiar...

A Parallel Solution to the Extended Set Union Problem with Unlimited Backtracking

In this paper, we study on the EREW-PRAM model a parallel solution to the extended set union problem with unlimited backtracking which maintains a dynamic partition Π of an n-element set S subject to the usual operations Find, Union, Backtrack and Restore as well as the new operations SetUnion, MultiUnion. The SetUnion operation is a special case of the commonly known Union operation aimed to u...

Which Spatial Partition Trees are Adaptive to Intrinsic Dimension?

Recent theory work has found that a special type of spatial partition tree – called a random projection tree – is adaptive to the intrinsic dimension of the data from which it is built. Here we examine this same question, with a combination of theory and experiments, for a broader class of trees that includes k-d trees, dyadic trees, and PCA trees. Our motivation is to get a feel for (i) the ki...

Publication date: 2015